On SIGTERM or SIGINT, mark the health check unhealthy so the load balancer routes traffic away, then call server.Shutdown() with a deadline context to drain in-flight requests before the process exits.
server.Shutdown() stops accepting new connections, closes idle ones, and waits for active requests to complete; hijacked connections such as WebSockets are neither closed nor waited on
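Set a shutdown deadline (30s) on the context; when it expires, Shutdown() returns the context's error without closing active connections, so call server.Close() afterwards to force-close stuck ones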
Mark the health endpoint unhealthy before calling Shutdown(), then wait long enough for the load balancer to notice and stop routing new traffic
Drain message queue consumers before exit (stop fetching new messages, then finish and acknowledge in-flight ones) to avoid duplicate processing on restart
Use a preStop hook in Kubernetes to add a delay before SIGTERM is sent; the hook's runtime counts against terminationGracePeriodSeconds